Sample Trace: Deriving Fast Approximation for Repetitive Queries

Authors

  • Feng Yu
  • Wen-Chi Hou
  • Cheng Luo
Abstract

Repetitive queries are queries that are likely to be executed repeatedly in the future. Queries used to generate periodic reports or to perform routine summarization and data analysis belong to this category. Repetitive queries can constitute a large portion of the daily activities of a database system, and thus deserve extra optimization effort. In this paper, we propose to record information about how tuples are joined in a repetitive query, called the query trace. We prove that the query trace is sufficient to compute the exact selectivities of joins for all plans of a given query. To reduce the space and time overheads of generating the query trace, we propose to construct only a sample of the query trace, called a sample trace, which can be much smaller than a (complete) query trace. A special operation, called a sample outer join, is designed to accomplish this. Accurate estimates of join selectivities, with associated confidence intervals, can be derived easily from the sample trace. Extensive experiments show that the sample trace can be constructed efficiently and offers a controllable trade-off between accuracy and efficiency when estimating join selectivities for repetitive queries.

Keywords: query optimization, query re-optimization, trace, sampling method, sample trace
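The abstract does not spell out the sample outer join or the estimators built on the sample trace, so the following is only a generic illustration of the underlying idea: estimating a join selectivity, with a confidence interval, from a sample rather than from the full join. It is a minimal sketch, not the authors' method. It assumes a single equi-join on one key, uniform sampling without replacement from one relation, and a normal-approximation interval; the function and parameter names (`estimate_join_selectivity`, `r_keys`, `s_keys`, `z`) are hypothetical.

```python
import math
import random
from collections import Counter


def exact_join_selectivity(r_keys, s_keys):
    """Exact selectivity |R join S| / (|R| * |S|) for an equi-join on one key."""
    s_counts = Counter(s_keys)
    join_size = sum(s_counts[k] for k in r_keys)
    return join_size / (len(r_keys) * len(s_keys))


def estimate_join_selectivity(r_keys, s_keys, sample_size, z=1.96, seed=0):
    """Estimate the join selectivity from a uniform sample of R.

    Assumption (not from the paper): each sampled R tuple is probed against
    the key frequencies of S, and the interval uses the normal approximation
    with z = 1.96 (roughly 95% confidence). Requires sample_size >= 2.
    """
    rng = random.Random(seed)
    s_counts = Counter(s_keys)                 # join-key frequencies in S
    sample = rng.sample(r_keys, sample_size)   # uniform, without replacement
    matches = [s_counts[k] for k in sample]    # S matches per sampled R tuple

    mean = sum(matches) / sample_size
    var = sum((m - mean) ** 2 for m in matches) / (sample_size - 1)
    half_width = z * math.sqrt(var / sample_size)

    n_s = len(s_keys)
    estimate = mean / n_s
    low = max(0.0, (mean - half_width) / n_s)
    high = (mean + half_width) / n_s
    return estimate, (low, high)


if __name__ == "__main__":
    data_rng = random.Random(42)
    r = [data_rng.randrange(100) for _ in range(10_000)]
    s = [data_rng.randrange(100) for _ in range(5_000)]
    est, (lo, hi) = estimate_join_selectivity(r, s, sample_size=500)
    print("estimate:", est, "interval:", (lo, hi))
    print("exact:   ", exact_join_selectivity(r, s))
```

Increasing `sample_size` narrows the interval at the cost of more probing work, which is the kind of accuracy versus efficiency trade-off the abstract describes for the sample trace.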

Similar Resources

Approximate Range Queries by Histograms in OLAP

Online analytical processing applications typically analyze a large amount of data by means of repetitive queries involving aggregate measures on such data. In fast OLAP applications, it is often advantageous to provide approximate answers to queries in order to achieve very high performance. One way to achieve this is to submit queries on compressed data in place of the original data. H...

Approximate Furthest Neighbor with Application to Annulus Query

Much recent work has been devoted to approximate nearest neighbor queries. Motivated by applications in recommender systems, we consider approximate furthest neighbor (AFN) queries and present a simple, fast, and highly practical data structure for answering AFN queries in high-dimensional Euclidean space. The method builds on the technique of Indyk (SODA 2003), storing random projections to pr...

Fast Approximation of Self-Similar Network Traffic

Recent network traffic studies argue that network arrival processes are much more faithfully modeled using statistically self-similar processes instead of traditional Poisson processes [LTWW94a, PF94]. One difficulty in dealing with self-similar models is how to efficiently synthesize traces (sample paths) corresponding to self-similar traffic. We present a fast Fourier transform method for synth...

Deriving Fuzzy Inequalities Using Discrete Approximation of Fuzzy Numbers

Most research in the domain of fuzzy number comparison serves the purpose of ordering fuzzy numbers. To compare two fuzzy numbers, beyond determining their order, it is necessary to derive the magnitude of their order. In line with this idea, the concept of inequality is no longer crisp; instead, it becomes fuzzy in the sense of representing partial belonging or de...

Popularity-Based Ranking for Fast Approximate kNN Search

Similarity searching has become widely available in many on-line archives of multimedia data. Users accessing such systems look for data items similar to their specific query object and typically refine results by re-running the search with a query from the results. We study this issue and propose a mechanism of approximate kNN query evaluation that incorporates statistics of accessing index da...

Publication date: 2014